Techniques for Retrieving Documents Using an Image Capture Device

ABSTRACT

Embodiments of the present invention provide techniques for retrieving electronic documents based upon images captured using an image capture device. One or more images captured by a user using an image capture device are used to search a set of documents to retrieve one or more documents that match the search query. The one or more documents retrieved by the search may then be provided to the user or some other recipient.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a continuation of and claims priority from U.S. patent application Ser. No. 10/957,080 filed on Oct. 1, 2004, the contents of which are incorporated by reference herein in their entirety for all purposes.

BACKGROUND OF THE INVENTION

The present invention relates to document retrieval techniques, and more particularly to techniques for retrieving electronic documents based upon images captured using an image capture device.

Image capture devices such as cameras, mobile phones equipped with cameras, etc. have seen widespread use in recent times. For example, mobile phone cameras are becoming ubiquitous in many parts of the world. Image capture devices such as mobile phones are also becoming powerful computing engines in their own right, with significant storage capacity, ability to run powerful productivity software, wireless networking capabilities, and the like. The existence and emergence of such image capture devices suggests new applications for information input and document retrieval.

BRIEF SUMMARY OF THE INVENTION

Embodiments of the present invention provide techniques for retrieving electronic documents based upon images captured using an image capture device. Examples of image capture devices include cameras (both film-based and digital), mobile phones, personal data assistants (PDAs), laptops, other portable devices, scanners, etc. equipped with image capture capabilities. One or more images captured by a user using an image capture device are used to search a set of electronic documents to retrieve one or more electronic documents corresponding to the captured image. The one or more retrieved documents may then be provided to the user or some other recipient.

According to an embodiment of the present invention, techniques are provided for retrieving electronic documents. An image captured using an image capture device is received. Contents of the image are extracted. A search query is formed based upon the extracted contents. A plurality of electronic documents is searched to identify a first electronic document that satisfies the search query.

According to another embodiment of the present invention, techniques are provided for retrieving electronic documents. A first image captured using an image capture device is received. Contents of the first image are extracted. A search query is formed based upon the extracted contents of the first image. A search is performed to identify a first set of electronic documents comprising one or more electronic documents that satisfy the search query. It is determined if the first set of electronic documents comprises a predefined number of electronic documents. A second image is requested if the first set of electronic documents does not comprise the predefined number of electronic documents. Contents of the second image are extracted. A search query is formed based upon the extracted contents of the second image. A search is performed to identify a second set of electronic documents comprising one or more electronic documents that satisfy the search query based upon the extracted contents of the second image.

According to yet another embodiment of the present invention, techniques are provided for retrieving electronic documents in which a first image captured by using an image capture device is received. Contents of the first image are extracted. A search query is formed based upon the extracted contents of the first image. A search is performed using the search query formed based upon the extracted contents of the first image. (a) Another image is requested if the search does not identify at least one electronic document. (b) Another image is received. (c) Contents of the another image are extracted. (d) A search query is formed based upon the extracted contents of the another image. (e) A search is performed using the search query formed based upon the extracted contents of the another image. (a), (b), (c), (d), and (e) are repeated until the search identifies at least one electronic document that satisfies the search query.

According to yet another embodiment of the present invention, techniques are provided for retrieving electronic documents in which an image captured using an image capture device is received. A plurality of text patterns are extracted from the image using an image processing technique. A subset of text patterns is selected from the plurality of text patterns such that the plurality of text patterns comprises at least one text pattern that is not included in the subset. A search query is formed based upon the subset of text patterns. A plurality of electronic documents is searched to identify a first electronic document that satisfies the search query.

According to yet another embodiment of the present invention, techniques are provided for retrieving electronic documents in which an image captured using an image capture device is received. A plurality of text patterns are extracted from the image using an image processing technique. A first subset of text patterns is selected from the plurality of text patterns such that the plurality of text patterns comprises at least one text pattern that is not included in the first subset. A search query is formed based upon the first subset of text patterns. A plurality of electronic documents is searched to identify a first electronic document that satisfies the search query. A search is performed using the search query formed based upon the first subset of text patterns. A second subset of text patterns is selected from the plurality of text patterns if the search performed using the first subset of text patterns does not identify at least one electronic document. A search query is formed based upon the second subset of text patterns. A search is performed using the search query formed based upon the second subset of text patterns.

The foregoing, together with other features, embodiments, and advantages of the present invention, will become more apparent when referring to the following specification, claims, and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a simplified block diagram of a system that may incorporate an embodiment of the present invention;

FIG. 2 is a simplified high-level flowchart depicting a method of retrieving an electronic document based upon a captured image according to an embodiment of the present invention;

FIG. 3 is an example of an image captured by an image capture device that may be used as input for document retrieval according to an embodiment of the present invention;

FIG. 4 is a simplified high-level flowchart depicting a method of retrieving an electronic document based upon a captured image and including a feedback loop according to an embodiment of the present invention;

FIG. 5 is a simplified high-level flowchart depicting a method of retrieving an electronic document based upon a captured image and including a feedback loop for reducing the number of retrieved documents according to an embodiment of the present invention;

FIG. 6 is a simplified high-level flowchart depicting a method of retrieving an electronic document based upon a captured image using a subset of the extracted text patterns according to an embodiment of the present invention;

FIG. 7 is a simplified high-level flowchart depicting a method of retrieving an electronic document based upon a captured image using a selected search engine according to an embodiment of the present invention; and

FIG. 8 is a simplified block diagram of a computer system that may be used to practice an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, for the purposes of explanation, specific details are set forth in order to provide a thorough understanding of the invention. However, it will be apparent that the invention may be practiced without these specific details.

Embodiments of the present invention provide techniques for retrieving electronic documents based upon images captured using an image capture device. Examples of image capture devices include cameras (both film-based and digital), mobile phones equipped with a camera, personal data assistants (PDAs) with image capture capabilities, laptops, scanners, and the like. In general, an image capture device may be any device that is capable of capturing an image. One or more images captured by a user using an image capture device are used to search a set of electronically stored documents (“electronic documents”) to retrieve one or more documents corresponding to the captured image. The one or more documents retrieved for the captured image may then be provided to the user or some other recipient.

FIG. 1 is a simplified block diagram of a system 100 that may incorporate an embodiment of the present invention. FIG. 1 is merely illustrative of an embodiment incorporating the present invention and does not limit the scope of the invention as recited in the claims. One of ordinary skill in the art would recognize other variations, modifications, and alternatives.

A user may use an image capture device 102 to capture an image 104. As shown in FIG. 1, the user may capture an image 104 of a page (or partial page) of a paper document 106. Paper document 106 may be any document printed on a paper medium. Examples of image capture device 102 include a camera, a mobile phone equipped with image capture capabilities, and the like. The captured image 104 may be of the entire contents of the paper document or a portion thereof. It should be apparent that other types of images such as images of scenes, objects, etc. may also be captured using image capture device 102 and used in accordance with the teachings of the present invention.

In one embodiment, image capture device 102 may be configured to process image 104, form a search query based upon image 104, and perform a search using a search engine 108 to find one or more electronic documents satisfying the search query. The search query may be communicated to a search engine 108 via communication network 110. In an alternative embodiment, part of the processing may be performed by another server or system 105. For example, as shown in FIG. 1, image capture device 102 may communicate image 104 to server 105 via communication network 110. Server 105 may be configured to process image 104, form a search query based upon image 104, and perform a search using a search engine 108 to find one or more electronic documents satisfying the search query.

Communication network 110 may be a local area network (LAN), a wide area network (WAN), a wireless network, an Intranet, the Internet, a private network, a public network, a switched network, or any other suitable communication network. Communication network 110 may comprise many interconnected systems and communication links such as hardwire links, optical links, satellite or other wireless communications links, wave propagation links, or any other mechanisms for communication of information. Various communication protocols may be used to facilitate communication of information via communication network 110, including TCP/IP, HTTP protocols, extensible markup language (XML), wireless application protocol (WAP), protocols under development by industry standard organizations, vendor-specific protocols, customized protocols, and others. Images might be transmitted in any of a number of standard file formats, such as JPEG, GIF, PNG, TIFF, JPEG 2000, and the like.

For example, in one embodiment, communication network 110 may be a wireless network. A user may capture an image 104 using a mobile phone camera. The image may then be processed by the mobile phone camera to form a search query, and the search query may be communicated to a search engine 108 (or to a server 105 that uses services of a search engine) via the wireless network. In some embodiments, multiple images captured by image capture device 102 may be communicated to a server 105.

One or more search engines 108 may be available for performing the search. A search engine 108 may receive a search query from image capture device 102 or from server 105 and search a set of electronic documents 112 to identify one or more electronic documents that satisfy the search query. One or more search engines 108 may perform the search in collaboration. Search engine 108 may be configured to search electronic documents 112 that are accessible to the search engine either directly or via communication network 110. For example, search engine 108 may be configured to search electronic documents (e.g., web pages) provided by content providers and available via the Internet. The searched documents may also include documents prepared and stored for an entity such as a corporation, office, school, government organization, etc. Examples of a search engine include Google™ search engine, search engines provided by entities such as businesses, companies, etc., Yahoo™ search engine, and others.

One or more electronic documents identified by search engine 108 as satisfying or matching the search query may then be provided to the user or some other recipient. In one embodiment, the documents retrieved from the search may be communicated to the image capture device that was used to capture the image used as input for the search. For example, as shown in FIG. 1, an electronic document 114 identified by search engine 108 may be wirelessly communicated to image capture device 102. The retrieved electronic document 114 may then be output to the user via an application (e.g., a browser, a word processing application) executing on image capture device 102.

The electronic document retrieved from the search may also be provided to the user by communicating it to some destination (e.g., document delivery destination 116 depicted in FIG. 1) other than the image capture device. For example, the retrieved document may be sent via email to the user's inbox or email address. In a similar manner, various destinations and distribution channels may be used to provide the retrieved document to the user. The retrieved document may be communicated to the user in various different formats such as a web page, HTML format, Adobe PDF file, Microsoft Word document, etc. The retrieved electronic document may also be provided to other recipients such as other persons, systems, applications, etc.

FIG. 2 is a simplified high-level flowchart 200 depicting a method of retrieving an electronic document based upon a captured image according to an embodiment of the present invention. The method depicted in FIG. 2 may be performed by software code modules or instructions executed by a processor, hardware modules, or combinations thereof. Flowchart 200 depicted in FIG. 2 is merely illustrative of an embodiment of the present invention and is not intended to limit the scope of the present invention. Other variations, modifications, and alternatives are also within the scope of the present invention. The method depicted in FIG. 2 may be adapted to work with different implementation constraints.

As depicted in FIG. 2, processing is initiated upon receiving or obtaining a captured image (step 202). The resolution and quality of the image may vary depending on the image capture device used to capture the image and the quality of the subject matter whose image is captured. Different image capture devices may be used to capture the image.

Contents (or portions thereof) of the captured image are then extracted by applying one or more image processing techniques (step 204). The extracted contents may include text patterns (e.g., words, phrases), image objects, and other objects. In one embodiment, optical character recognition (OCR) techniques are used to extract characters from the captured image.

A set of text patterns is then determined from the contents extracted in step 204 (step 206). The text patterns may be words or phrases. In one embodiment, each text line extracted from the contents of the image is treated as a text pattern. Various techniques may be used to determine the text patterns. In one embodiment, a word filter may be applied to the OCR results received in 204, rejecting any characters or words or marks that are not found in a lexicon (e.g., the English language dictionary). In this manner, punctuation, illegible marks, invalid words, etc. that may have been extracted in step 204 are excluded or filtered out. In some embodiments, steps 204 and 206 may be combined into a single step whereby text patterns are extracted from the captured image.
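
For purposes of illustration only, the word filter of step 206 might be sketched as follows. The sketch assumes the OCR output arrives as a list of text lines; the tiny `LEXICON` set and the function names `filter_line` and `text_patterns` are hypothetical and are not part of the described embodiments. An actual lexicon would be a full dictionary.

```python
import re

# Hypothetical miniature lexicon; a real system would load a full dictionary.
LEXICON = {"camera", "phones", "can", "be", "channels", "for", "viral"}


def filter_line(ocr_line, lexicon=LEXICON):
    """Keep only the tokens of one OCR text line that appear in the lexicon."""
    # Split on anything that is not a letter, which discards punctuation
    # and stray marks produced by the OCR engine.
    tokens = re.findall(r"[A-Za-z]+", ocr_line.lower())
    return " ".join(t for t in tokens if t in lexicon)


def text_patterns(ocr_lines, lexicon=LEXICON):
    """Treat each text line that survives filtering as one text pattern."""
    patterns = (filter_line(line, lexicon) for line in ocr_lines)
    return [p for p in patterns if p]
```

Lines that contain no lexicon words at all yield no text pattern, so purely illegible lines drop out of the set.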

A search query is then formed based upon the set of text patterns determined in 206 (step 208). In one embodiment, the search query is formed by conjoining the text patterns determined in 206. For example, the text patterns may be conjoined using the Boolean “and” operation (or set intersection operation). In other embodiments, as described below in further detail, a subset of the text patterns obtained in 206 may be included in the search query.

A search is then performed using the search query formed in 208 to identify electronic documents that satisfy the search query (step 210). For a search query formed by conjoining multiple text patterns, a document satisfies the search query if it includes all of the text patterns. An electronic document satisfies a search query if it satisfies all the conditions and terms of the search query. One or more search engines may be used to perform the search in step 210 using the search query. Various different types of search engines may be used to perform the search such as Google™ and the like. The set of documents that is searched may also vary depending upon the search engine used. The set of documents determined in 210 may comprise one or more electronic documents. Alternatively, no documents may be returned by the search if none of the documents in the searched set of electronic documents satisfies the search query.
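
The conjunctive matching rule described above may be sketched, under simplifying assumptions, against a small in-memory corpus. The functions `satisfies` and `search` and the dictionary-based corpus are hypothetical stand-ins for a search engine; plain substring matching is used here only to keep the illustration short and is cruder than what an actual search engine would do.

```python
def satisfies(document_text, patterns):
    """Return True if the document contains every text pattern (Boolean AND)."""
    # Normalize case and whitespace so that line breaks do not defeat matching.
    text = " ".join(document_text.lower().split())
    return all(p.lower() in text for p in patterns)


def search(corpus, patterns):
    """Return the ids of documents in `corpus` (id -> text) satisfying the query."""
    return [doc_id for doc_id, text in corpus.items() if satisfies(text, patterns)]
```

An empty result list corresponds to the case in which no document in the searched set satisfies the search query.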

The set of documents retrieved from the search in 210 is then provided to a recipient (step 212). The recipient may be a person, a system, an application, or some other entity. Various techniques may be used for providing the set of documents to the recipient. For example, the set of documents may be communicated to a user of the image capture device that was used to capture the image received in 202. Various different techniques may be used to provide the retrieved set of documents to the user. In one embodiment, the retrieved documents may be communicated to the image capture device used by the user to capture the image that was used as the input for the search. The retrieved electronic documents may then be output to the user using an application (e.g., a browser, a word processing application, etc.) executing on the image capture device. The retrieved set of documents may also be provided to the user by delivering the documents to some destination other than the image capture device. For example, an email comprising the retrieved set of documents may be sent to the inbox or email address of the user. Various other destinations and delivery channels may be used for providing the documents retrieved from the search. The destinations and delivery techniques may be user-configurable.

The processing depicted in FIG. 2 may be performed automatically upon receiving an image. In one embodiment, the image capture device may be used to capture the image and communicate the image to a server that performs subsequent processing. The server may be configured to process the image, form a search query, and use a search engine to identify documents that satisfy the search query. In other embodiments, the image capture device may have sufficient computing power to perform some or all of the processing steps depicted in FIG. 2. For example, in addition to capturing the image, the image capture device may be configured to apply OCR techniques to extract image contents, determine a set of text patterns, form a search query based upon the text patterns, and then use a search engine to find electronic documents that satisfy the search query. The electronic documents identified by the search engine may then be communicated to the image capture device or to some other destination. Accordingly, depending on the processing power of the image capture device, the processing depicted in FIG. 2 may be performed by the image capture device and/or a server.

The processing depicted in FIG. 2 may be illustrated with the following example. Assume that image 300 depicted in FIG. 3 is received as input for the search. Image 300 is a binarized image that may have been captured using an image capture device such as a mobile phone camera. As can be seen from FIG. 3, image 300 is of low quality, and only a few words can be picked out. The output of OCR applied to image 300 may look as follows:

  ~,~. r~ . . . Camera phones can
  ~″~ . . . be channels for viral
  ,~,:. . . marketing, where
  ~,~ ,.~   consumers convert
  ~,,,,,~ . . . their friends to new

Applying a word filter to the OCR output, each line may yield the following phrases (text patterns):

“camera phones can”
“be channels for viral”
“marketing where”
“consumers convert”
“their friends to new”

A search query may then be formed by conjoining all the text patterns. The search query may look as follows:

“camera phones can” AND “be channels for viral” AND “marketing where” AND “consumers convert” AND “their friends to new”
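
A query string of the above form might be produced as sketched below. The function name `build_query` is hypothetical, and the sketch assumes a search engine that accepts quoted phrases joined by a literal AND operator; the query syntax of any particular search engine may differ.

```python
def build_query(patterns):
    """Quote each text pattern and conjoin the phrases with AND."""
    return " AND ".join(f'"{p}"' for p in patterns)


query = build_query([
    "camera phones can",
    "be channels for viral",
    "marketing where",
    "consumers convert",
    "their friends to new",
])
```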

The search query may then be provided to a search engine such as Google™ to retrieve a set of electronic documents that satisfy the search query. The retrieved set of electronic documents may then be provided to the user.

It is possible in certain instances that no documents are retrieved by applying the search query. This may occur because the corpus of documents searched did not include a document containing the search terms. It may also occur if the image provided for the search is of such poor quality that text patterns cannot be accurately extracted from the image, in which case the extracted text patterns may be inaccurate and thus result in an inaccurate search query. In order to compensate for errors that may occur due to poor image quality or inadequate image processing techniques that are used to extract contents from the image, a feedback loop may be provided to enable the user to provide additional information for the search to increase the chances of finding a matching document.

FIG. 4 is a simplified high-level flowchart 400 depicting a method of retrieving an electronic document based upon a captured image and including a feedback loop according to an embodiment of the present invention. The method depicted in FIG. 4 may be performed by software code modules or instructions executed by a processor, hardware modules, or combinations thereof. Flowchart 400 depicted in FIG. 4 is merely illustrative of an embodiment of the present invention and is not intended to limit the scope of the present invention. Other variations, modifications, and alternatives are also within the scope of the present invention. The method depicted in FIG. 4 may be adapted to work with different implementation constraints. The processing depicted in FIG. 4 may be performed by the image capture device and/or a server using a search engine.

As depicted in FIG. 4, processing is initiated upon receiving or obtaining a captured image (step 402). One or more image processing techniques such as OCR techniques are then applied to the captured image to extract contents (or portions thereof) of the captured image (step 404). A set of text patterns is then determined from the contents extracted in step 404 (step 406). A search query is formed based upon the text patterns determined in 406 (step 408). A search is then performed using the search query formed in 408 (step 410). The processing depicted in steps 402, 404, 406, 408, and 410 of FIG. 4 is similar to the processing depicted in steps 202, 204, 206, 208, and 210 of FIG. 2 and described above.

After a set of electronic documents has been retrieved by running a search using the search query, a check is made to see if at least one document is retrieved as a result of the search (i.e., at least one hit for the search query) (step 412). If it is determined in 412 that at least one document is included in the retrieved set of documents, then the set of retrieved documents may be provided to a recipient (step 414). The recipient may be a person, a system, an application, or some other entity. Various techniques may be used for providing the set of documents to the recipient. For example, the set of documents may be communicated to a user of the image capture device that was used to capture the image received in 402. Various techniques may be used to provide the set of documents to the user.

On the other hand, if it is determined in 412 that not even one electronic document was retrieved by the search, then the user is requested to provide another image to be used as input for the search (step 416). After another image is obtained (step 418), processing then continues with step 404 using the new image obtained in 418. A new search is then performed based upon the newly obtained image. In this manner a feedback loop is provided whereby the user can provide additional images for the search. The feedback loop may be repeated until at least one electronic document is identified by the search.
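
The feedback loop of FIG. 4 may be sketched as follows. The three callables (`extract_patterns`, `run_search`, `request_image`) are hypothetical placeholders for the image processing, search, and user-interaction components. The `max_attempts` cap is an assumption added so that the sketch always terminates; the flowchart as described simply repeats until at least one document is found.

```python
def retrieve_with_feedback(first_image, extract_patterns, run_search,
                           request_image, max_attempts=5):
    """Repeat extraction and search with new images until a search hits."""
    image = first_image                          # step 402
    for attempt in range(max_attempts):
        patterns = extract_patterns(image)       # steps 404 and 406
        results = run_search(patterns)           # steps 408 and 410
        if results:                              # step 412: at least one hit?
            return results                       # step 414
        if attempt + 1 < max_attempts:
            image = request_image()              # steps 416 and 418
    return []                                    # assumed cap reached, no hit
```

A caller might pass a function that prompts the user of the image capture device for a new photograph as `request_image`.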

The additional images may be images of the same content as the original image or of different content. For example, if the first image was of a portion of a page from an article printed on a paper document, the second image provided may be an image of another section of the article or even of the same portion as the first image but captured such that the image quality of the second image is better than the first image.

As described above, in certain instances it is possible that the set of documents retrieved from applying the search query may contain multiple documents (i.e., multiple hits). This may be inconvenient to the user especially if the retrieved set comprises several documents. A feedback loop may be provided to enable the user to provide additional information that can be used to reduce the number of electronic documents that are retrieved.

FIG. 5 is a simplified high-level flowchart 500 depicting a method of retrieving an electronic document based upon a captured image and including a feedback loop for reducing the number of retrieved documents according to an embodiment of the present invention. The method depicted in FIG. 5 may be performed by software code modules or instructions executed by a processor, hardware modules, or combinations thereof. Flowchart 500 depicted in FIG. 5 is merely illustrative of an embodiment of the present invention and is not intended to limit the scope of the present invention. Other variations, modifications, and alternatives are also within the scope of the present invention. The method depicted in FIG. 5 may be adapted to work with different implementation constraints. The processing depicted in FIG. 5 may be performed by the image capture device and/or a server using a search engine.

As depicted in FIG. 5, processing is initiated upon receiving or obtaining a captured image (step 502). One or more image processing techniques such as OCR techniques are then applied to the captured image to extract contents (or portions thereof) of the captured image (step 504). A set of text patterns is then determined from the contents extracted in step 504 (step 506). A search query is formed based upon the text patterns determined in 506 (step 508). A search is then performed using the search query formed in 508 (step 510). The processing depicted in steps 502, 504, 506, 508, and 510 of FIG. 5 is similar to the processing depicted in steps 202, 204, 206, 208, and 210 of FIG. 2 and described above.

After a set of electronic documents has been retrieved from the search using the search query, a check is made to see if the retrieved set of electronic documents comprises precisely one document (i.e., exactly one hit for the search query) (step 512). If it is determined in 512 that only one document is returned by the search, then the one document is provided to a recipient (step 514). The recipient may be a person, a system, an application, or some other entity. Various techniques may be used for providing the electronic document to the recipient. For example, the electronic document may be communicated to a user of the image capture device that was used to capture the image received in 502. Various techniques may be used to provide the electronic document to the user.

On the other hand, if it is determined in 512 that multiple electronic documents are retrieved in response to the search, then the user is requested to provide another image to be used as input for the search in order to reduce the number of retrieved documents (step 516). After another image is obtained (step 518), processing then continues with step 504 using the new image obtained in 518. Text patterns from the new image are obtained in 506. A new search query is formed in 508 based upon the newly obtained image. In one embodiment, in 508, the previously applied search query is augmented with the text patterns extracted from the image obtained in 518 (or the search query is formed using some combination of text patterns extracted from the previous image and the new image). In this manner, a new and potentially more precise search query is formed in 508. The augmented search query is then executed in 510. The processing is repeated until only one electronic document is returned in response to the search query. In another embodiment, in 508, a new search query is formed based solely upon the text patterns extracted from the newly received image in 518 (i.e., the previous search patterns are not used). The new search query formed in 508 is then executed in 510.

In this manner, a feedback loop is provided whereby the user can provide additional information in the form of additional images to narrow down the number of electronic documents in the retrieved set of documents. The additional images may be images of different content than the previous image or even same content. For example, if the first image was of a portion of a page from an article printed on paper, the second image provided may be an image of another section of the article. In flowchart 500 depicted in FIG. 5 and described above, a threshold of one document is used for the check performed in 512. In alternative embodiments, the check may be performed using some predefined threshold number which may be user-configurable.
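
The narrowing loop of FIG. 5 may be sketched as follows, covering both the augmented-query embodiment and the embodiment in which only the newest image is used. The callables and the parameters `threshold`, `augment`, and `max_rounds` are hypothetical names; `max_rounds` is an assumption added so that the sketch terminates. The loop stops once the number of retrieved documents no longer exceeds the threshold, which also covers a search that returns no documents, a case the flowchart description does not address.

```python
def narrow_results(first_image, extract_patterns, run_search, request_image,
                   threshold=1, augment=True, max_rounds=5):
    """Request further images until at most `threshold` documents remain."""
    patterns = list(extract_patterns(first_image))       # steps 504 and 506
    results = run_search(patterns)                       # steps 508 and 510
    rounds = 1
    while len(results) > threshold and rounds < max_rounds:    # step 512
        new_patterns = list(extract_patterns(request_image()))  # steps 516, 518
        if augment:
            patterns = patterns + new_patterns   # augment the previous query
        else:
            patterns = new_patterns              # use the new image only
        results = run_search(patterns)           # step 510
        rounds += 1
    return results
```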

A variety of search engines may be used for performing the search. Many search engines however limit the number of terms or patterns that can be used at a time in a search query. Google™, for instance, has cut off the number of search terms in a query at ten words, having decided that this number is sufficient to uniquely identify a document within their database. As described above, several text patterns (e.g., more than 10) may be extracted from an image of sufficient resolution. However, it may not be possible to use all the extracted text patterns in a search query if the search engine limits the number of terms in a query. To compensate for this situation, several pattern/term selection techniques may be used to select a subset of text patterns to be used for the search. Further, even if the number of terms in a search query is not limited, the selection techniques may be used to select terms that improve the accuracy and reliability of the search. The selection techniques may also be used to reduce the number of documents in the retrieved set of documents.

FIG. 6 is a simplified high-level flowchart 600 depicting a method of retrieving an electronic document based upon a captured image using a subset of the extracted text patterns according to an embodiment of the present invention. The method depicted in FIG. 6 may be performed by software code modules or instructions executed by a processor, hardware modules, or combinations thereof. Flowchart 600 depicted in FIG. 6 is merely illustrative of an embodiment of the present invention and is not intended to limit the scope of the present invention. Other variations, modifications, and alternatives are also within the scope of the present invention. The method depicted in FIG. 6 may be adapted to work with different implementation constraints. The processing depicted in FIG. 6 may be performed by the image capture device and/or a server using a search engine.

As depicted in FIG. 6, processing is initiated upon receiving or obtaining a captured image (step 602). One or more image processing techniques such as OCR techniques are then applied to the captured image to extract contents (or portions thereof) of the captured image (step 604). A set of text patterns is then determined from the contents extracted in step 604 (step 606).

A subset of text patterns is then selected from the set of text patterns determined in 606 (step 608). The number of text patterns included in the subset is less than the number of text patterns in the set determined in 606 (i.e., the set of text patterns determined in 606 includes at least one text pattern that is not included in the subset). Various different techniques may be used to select text patterns to be included in the subset in 608. The text patterns to be included in the subset may be selected based upon various attributes of or associated with the text patterns. According to a first technique, the text patterns to be included in the subset may be randomly selected from the set of text patterns determined in 606. According to another technique, text patterns may be selected based upon their length. For example, text patterns with longer lengths may be selected prior to text patterns with shorter lengths. Geometric patterns may also be used to select the text patterns to be included in the subset.

According to another technique, text patterns may be selected based upon confidence data associated with the text patterns. In this embodiment, the image processing technique (e.g., an OCR engine) applied in 604 may be configured to associate confidence data with each of the extracted text patterns. The confidence data of a text pattern is a measure of the likelihood that the image processing technique used to extract the text patterns has achieved a correct conclusion as to the content of the text pattern. Various factors may influence the confidence data assigned for a text pattern. For instance, an image processing program might note that the image data was extremely noisy, and lower its confidence value for the text patterns extracted from the image data. As another example, the image processing technique might note that the included characters of a text pattern include the lowercase letter “l”, which is notoriously difficult to visually distinguish from the digit “1”, and thus the confidence data associated with that text pattern may be lowered. Based upon the confidence data, text patterns with high recognition confidence data are selected first to be included in the subset determined in 608.
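The selection techniques described for step 608 (random selection, selection by length, and selection by confidence data) might be sketched as follows. This is a minimal illustration and not the implementation of any embodiment: the `TextPattern` class, the function names, and the representation of confidence data as a float between 0.0 and 1.0 are all assumptions made for the example.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPattern:
    """Hypothetical container for one extracted text pattern."""
    text: str
    confidence: float  # assumed OCR confidence in the range [0.0, 1.0]


def select_random(patterns, k, seed=None):
    """Randomly pick up to k patterns from the extracted set."""
    rng = random.Random(seed)
    return rng.sample(patterns, min(k, len(patterns)))


def select_longest(patterns, k):
    """Pick up to k patterns, longer text patterns first."""
    return sorted(patterns, key=lambda p: len(p.text), reverse=True)[:k]


def select_most_confident(patterns, k):
    """Pick up to k patterns, highest recognition confidence first."""
    return sorted(patterns, key=lambda p: p.confidence, reverse=True)[:k]


if __name__ == "__main__":
    extracted = [
        TextPattern("retrieval", 0.90),
        TextPattern("of", 0.99),
        TextPattern("documents", 0.40),
    ]
    print([p.text for p in select_longest(extracted, 2)])
    print([p.text for p in select_most_confident(extracted, 2)])
```

Each function returns a list that can be passed on to query formation in step 610; a different selector can be substituted on a later pass without changing the rest of the processing.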

A search query is then formed based upon the subset of text patterns determined in 608 (step 610). In one embodiment, the search query is formed by conjoining the text patterns determined in 608. A search is then performed to identify electronic documents that satisfy the search query formed in 610 (step 612).

A check is then made to determine whether the set of electronic documents retrieved from the search comprises precisely one document (i.e., exactly one hit for the search query; this number is user-configurable), multiple documents, or no document (i.e., the search did not yield any match) (step 614). If it is determined in 614 that only one document is included in the retrieved set of documents, then the one electronic document is provided to a recipient (step 616) and processing terminates. The recipient may be a person, a system, an application, or some other entity. Various techniques may be used for providing the electronic document to the recipient. For example, the electronic document may be communicated to a user of the image capture device that was used to capture the image received in 602. Various techniques may be used to provide the electronic document to the user.

On the other hand, if it is determined in 614 either that multiple electronic documents are retrieved by the search or no document is retrieved, then processing continues with step 608 wherein another subset of text patterns from the set of text patterns determined in 606 is determined. The text patterns in the newly selected subset may be different from the text patterns in the previously selected subset (there may however be overlaps between the subsets). For example, if text patterns with the highest confidence data were selected during the first pass of the processing, then in the second pass, the next set of text patterns based upon confidence data may be selected. Various other selection techniques may also be used.

Processing then continues with step 610 using the subset determined in 608. In this manner, the search may be iteratively performed using text patterns from the set of text patterns determined in 606. Each iteration may use a different subset of text patterns for the search. Such an iterative approach provides several advantages including reducing the number of times that a user has to re-shoot an image when either a match cannot be obtained or when too many matches are obtained. The processing is repeated until only one electronic document is returned in response to the search query.
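The iterative processing of steps 608 through 616 might be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the `AND` query syntax and phrase quoting are assumed rather than taken from any particular search engine, the toy in-memory search stands in for a real search engine, and the `max_passes` guard is an addition of the example so that the loop always terminates.

```python
from itertools import islice


def form_query(patterns):
    """Conjoin text patterns into one query string (AND syntax is assumed)."""
    return " AND ".join(f'"{p}"' for p in patterns)


def batches(ranked_patterns, size):
    """Yield successive subsets of at most `size` patterns, best-ranked first."""
    it = iter(ranked_patterns)
    while batch := list(islice(it, size)):
        yield batch


def retrieve(ranked_patterns, search, subset_size, max_passes=10):
    """Search with successive subsets until exactly one document is returned."""
    for n, subset in enumerate(batches(ranked_patterns, subset_size)):
        if n >= max_passes:  # guard added for this sketch only
            break
        hits = search(form_query(subset))  # steps 610 and 612
        if len(hits) == 1:                 # check of step 614
            return hits[0]                 # step 616
    return None  # no subset produced exactly one match


def make_toy_search(corpus):
    """Stand-in search engine: a document matches if it contains every term."""
    def search(query):
        terms = [t.strip('"') for t in query.split(" AND ")]
        return [name for name, text in corpus.items()
                if all(t in text for t in terms)]
    return search


if __name__ == "__main__":
    search = make_toy_search({
        "a.pdf": "image capture retrieval",
        "b.pdf": "image capture device",
    })
    print(retrieve(["image", "capture", "retrieval"], search, subset_size=2))
```

In this toy run the first subset matches both documents, so a second subset is tried and yields the single document. Non-overlapping subsets are used here for simplicity; the description also permits overlapping subsets.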

In specific embodiments of the present invention, the processing depicted in FIG. 6 may be performed only when the number of text patterns extracted from the contents of the image is more than the number of terms allowed in a search query by the search engine used for performing the search. For example, a check may be made prior to step 608 to determine if the number of extracted text patterns exceeds the number of search terms allowed by the search engine. In this embodiment, a subset of text patterns may be selected only upon determining that the number of extracted text patterns exceeds the allowed number of search terms, else all the extracted text patterns may be used in the search query.
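The check performed prior to step 608 might look like the following sketch. The function name `terms_for_query`, the use of `None` to mean "no limit", and the `select` callback signature are assumptions made for illustration.

```python
def terms_for_query(patterns, max_terms, select):
    """Return the patterns to use in the search query.

    patterns  -- all text patterns extracted from the image
    max_terms -- term limit of the search engine, or None if unlimited (assumed)
    select    -- callable(patterns, k) returning a subset of k patterns
    """
    if max_terms is None or len(patterns) <= max_terms:
        # Within the limit: use every extracted pattern.
        return list(patterns)
    # Over the limit: fall back to a subset selection technique.
    return select(patterns, max_terms)


if __name__ == "__main__":
    def longest_first(patterns, k):
        return sorted(patterns, key=len, reverse=True)[:k]

    print(terms_for_query(["a", "bbb", "cc"], 2, longest_first))
    print(terms_for_query(["a", "bbb"], 2, longest_first))
```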

As shown in FIG. 1 and described above, several search engines may be available for performing the search. Different techniques may be used to determine the one or more search engines to be used for performing the search from the several available search engines. FIG. 7 is a simplified high-level flowchart 700 depicting a method of retrieving an electronic document based upon a captured image using selected one or more search engines according to an embodiment of the present invention. The method depicted in FIG. 7 may be performed by software code modules or instructions executed by a processor, hardware modules, or combinations thereof. Flowchart 700 depicted in FIG. 7 is merely illustrative of an embodiment of the present invention and is not intended to limit the scope of the present invention. Other variations, modifications, and alternatives are also within the scope of the present invention. The method depicted in FIG. 7 may be adapted to work with different implementation constraints. The processing depicted in FIG. 7 may be performed by the image capture device and/or a server using a search engine.

As depicted in FIG. 7, processing is initiated upon receiving or obtaining a captured image (step 702). One or more image processing techniques such as OCR techniques are then applied to the captured image to extract contents (or portions thereof) of the captured image (step 704). A set of text patterns is then determined from the contents extracted in step 704 (step 706). A search query is then formed using the text patterns or a subset of the text patterns determined in 706 (step 708).

A search engine (or multiple search engines) is then selected for performing the search (step 710). The search engine may be selected from multiple search engines that are available for performing the search. Various criteria may be used for determining which search engine to use. In one embodiment, the location (e.g., geographical location) of the user may be used to automatically select the search engine to be used. For example, a search engine that is proximal to the location of the user may be preferred to search engines that are further away. In this manner, local documents provided by the local search engine may be provided to the user preferentially over other documents.
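One way to select a search engine by geographic proximity in step 710 is sketched below. The `SearchEngine` record, its latitude and longitude fields, and the use of the haversine great-circle distance are assumptions of this example; the description does not prescribe how proximity is measured.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEngine:
    """Hypothetical record describing an available search engine."""
    name: str
    lat: float  # latitude of the engine's server, in degrees
    lon: float  # longitude of the engine's server, in degrees


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_engine(engines, user_lat, user_lon):
    """Return the engine geographically closest to the user's location."""
    return min(engines,
               key=lambda e: haversine_km(user_lat, user_lon, e.lat, e.lon))


if __name__ == "__main__":
    engines = [
        SearchEngine("shop", 37.0, -122.0),
        SearchEngine("remote", 40.0, -74.0),
    ]
    print(nearest_engine(engines, 37.001, -122.001).name)
```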

In another embodiment, a list of available search engines may be displayed to the user and the user may then select one or more search engines to be used. Various technologies such as multicast Domain Name System (DNS), Universal Plug and Play (UPnP), etc. may be used to detect search engine servers. These technologies can be implemented using wireless networking methods and can be implemented in a secure manner.

In another embodiment, the user may be allowed to designate one or more search engines to be used for searching. In yet another embodiment, a search engine may be selected based upon costs associated with the searching. For example, a search engine performing the cheapest search may be selected. Various other criteria may be used for selecting a search engine to perform the search.

A search engine may also be selected based upon the context of the search to be performed and the documents to be searched. For example, if documents related to an entity such as a business, an office, a government agency, etc. are to be searched, then a search engine provided by that entity may be selected in 710.

The search query formed in 708 is then communicated to the search engine determined in 710 (step 712). A search is performed by the search engine determined in 710 using the search query formed in 708 (step 714). Results of the search are then communicated from the search engine to a recipient (step 716). The recipient may be a person, a system, an application, or some other entity. For example, the search results may be communicated to a user of the image capture device that was used to capture the image received in 702. Various techniques may be used to provide the search results to the user. The documents retrieved by the search may be communicated to the image capture device used by the user to capture the image that formed the input for the search. The one or more documents retrieved by the search may also be delivered to other destinations.

In certain embodiments, search engines may be selected and used according to a tiered technique. For example, a first search engine may be selected during a first pass of the processing, followed by a second search engine during a second pass, and so on. In this manner, several search engines may be used to perform the searches. The search results from the various search engines may then be gathered and provided to the user. In this manner, electronic documents that are not accessible to one search engine may be searched by another search engine. This technique may also be used if satisfactory search results are not obtained from a search engine thereby requiring another search engine to be used. If a single document must be chosen as the result of a search, and multiple search engines have candidate documents, a priority ranking of search engines may be used to select documents of one search engine over documents from a different search engine.
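The tiered technique with a priority ranking might be sketched as follows. The function name, the representation of each engine as a `(name, search_function)` pair listed in priority order, and the rule of taking the first hit of the highest-priority engine when a single document is required are all assumptions of this example.

```python
def tiered_search(query, engines):
    """Query each engine in priority order and gather the results.

    engines -- list of (name, search_fn) pairs, highest priority first;
               each search_fn(query) returns a list of documents.
    Returns (results_by_engine, chosen_document).
    """
    results = {}
    for name, search in engines:  # one pass per search engine
        hits = search(query)
        if hits:
            results[name] = hits

    # If a single document is required, prefer the highest-priority engine
    # that returned candidates (assumed tie-break: its first hit).
    chosen = None
    for name, _ in engines:
        if name in results:
            chosen = results[name][0]
            break
    return results, chosen


if __name__ == "__main__":
    engines = [
        ("local", lambda q: []),
        ("corporate", lambda q: ["b.pdf"]),
        ("web", lambda q: ["c.html", "d.html"]),
    ]
    gathered, single = tiered_search('"image" AND "capture"', engines)
    print(gathered)
    print(single)
```

The gathered dictionary corresponds to providing all search results to the user, while the second return value corresponds to the case in which a single document must be chosen.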

The processing depicted in FIG. 7 and described above may be illustrated with the following example. Consider a user in a shopping mall. Upon entering a shop, the user sees a brochure for an object sold by the shop. The user may snap a picture of a page of the brochure. Based upon the location of the user and where the picture is taken, a search engine provided by the shop may be selected as the search engine to be used (i.e., the search engine that is geographically proximal to the user is selected). The captured image may be wirelessly communicated to a server provided by the shop that uses the search engine of the shop. The shop search engine may then search and retrieve an electronic copy of the brochure (or any other document related to the object such as a price list, etc.) and communicate the electronic copy to the user wirelessly.

As another example, a doctor's office may be configured to provide health related information to persons in the doctor's waiting room. The doctor's office may provide a search engine that is configured to provide the health related documents to the user in response to an image provided by the user. The image may be, for example, of some literature kept in the doctor's office. A business might provide authenticated search access to its employees to obtain electronic documents from documents stored by a document management system used by the business in response to receiving images of the documents or portions thereof.

FIG. 8 is a simplified block diagram of a computer system 800 that may be used to practice an embodiment of the present invention. As shown in FIG. 8, computer system 800 includes a processor 802 that communicates with a number of peripheral devices via a bus subsystem 804. These peripheral devices may include a storage subsystem 806, comprising a memory subsystem 808 and a file storage subsystem 810, user interface input devices 812, user interface output devices 814, and a network interface subsystem 816.

Bus subsystem 804 provides a mechanism for letting the various components and subsystems of computer system 800 communicate with each other as intended. Although bus subsystem 804 is shown schematically as a single bus, alternative embodiments of the bus subsystem may utilize multiple busses.

Network interface subsystem 816 provides an interface to other computer systems including various servers, search engines, networks, and capture devices. Network interface subsystem 816 serves as an interface for receiving data from other systems and for transmitting data from computer system 800 to other systems.

User interface input devices 812 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a barcode scanner, a touchscreen incorporated into the display, audio input devices such as voice recognition systems, microphones, and other types of input devices. In general, use of the term “input device” is intended to include all possible types of devices and mechanisms for inputting information to computer system 800.

User interface output devices 814 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices, etc. The display subsystem may be a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), or a projection device. In general, use of the term “output device” is intended to include all possible types of devices and mechanisms for outputting information from computer system 800. User interfaces according to the teachings of the present invention may be displayed by user interface output devices 814.

Storage subsystem 806 may be configured to store the basic programming and data constructs that provide the functionality of the present invention. Software code modules or instructions that provide the functionality of the present invention may be stored in storage subsystem 806. These software code modules or instructions may be executed by processor(s) 802. Storage subsystem 806 may also provide a repository for storing data used in accordance with the present invention. Storage subsystem 806 may comprise memory subsystem 808 and file/disk storage subsystem 810.

Memory subsystem 808 may include a number of memories including a main random access memory (RAM) 818 for storage of instructions and data during program execution and a read only memory (ROM) 820 in which fixed instructions are stored. File storage subsystem 810 provides persistent (non-volatile) storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a Compact Disk Read Only Memory (CD-ROM) drive, an optical drive, removable media cartridges, and other like storage media.

Computer system 800 can be of various types including a personal computer, a portable computer, a workstation, a network computer, a mobile phone, a PDA, a mainframe, a kiosk, a camera, an image capture device, or any other data processing system. Due to the ever-changing nature of computers and networks, the description of computer system 800 depicted in FIG. 8 is intended only as a specific example for purposes of illustrating the preferred embodiment of the computer system. Many other configurations having more or fewer components than the system depicted in FIG. 8 are possible.

According to an embodiment of the present invention, service providers may provide search services that may be used by users to retrieve electronic documents in response to images provided by the users. A service provider may provide servers that are configured to receive images from a user, process the images, perform searches based upon the images, and communicate the electronic documents found by the searches to the user. A user may be charged a fee to use the search services provided by the service provider. Various fee structures may be used. For example, in one embodiment, a fee may be charged for each electronic document that is found by the search and downloaded by the user. Advertisements and other marketing material may also be provided as part of the document download. As part of the search services, the search service providers may provide access to electronic documents that normally would not be accessible to the user.

Although specific embodiments of the invention have been described, various modifications, alterations, alternative constructions, and equivalents are also encompassed within the scope of the invention. For example, embodiments of the present invention described above perform searches using text patterns. In alternative embodiments, other contents of the captured image such as images contained within the captured image, other objects within the captured image (e.g., multimedia objects), etc. may also be used to perform the searches. The described invention is not restricted to operation within certain specific data processing environments, but is free to operate within a plurality of data processing environments.

Additionally, although the present invention has been described using a particular series of transactions and steps, it should be apparent to those skilled in the art that the scope of the present invention is not limited to the described series of transactions and steps. For example, various combinations of the processing depicted in FIGS. 2, 4, 5, 6, and 7 may be used in different embodiments of the present invention.

Further, while the present invention has been described using a particular combination of hardware and software, it should be recognized that other combinations of hardware and software are also within the scope of the present invention. The present invention may be implemented only in hardware, or only in software, or using combinations thereof.

The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. It will, however, be evident that additions, subtractions, deletions, and other modifications and changes may be made thereunto without departing from the broader spirit and scope of the invention as set forth in the claims.

1. A device comprising: a processor; and a storage medium coupled with the processor, wherein the processor is configured to: capture image data; form a search query based on the image data; communicate the search query to a search engine; and receive one or more electronic documents based on the search query.
2. The device of claim 1 wherein the processor is further configured to: extract contents from the image data; and form the search query based on the extracted content.
3. The device of claim 1 wherein the search query comprises textual information.
4. The device of claim 1 wherein the processor is further configured to display the one or more electronic documents on a display associated with the device.
5. The device of claim 1 wherein the processor is further configured to: determine a first text string from the image data; determine a second text string from the image data; and conjoin the first text string and the second text string to generate a third text string, wherein the search query comprises the third text string.
6. The device of claim 5 wherein the processor is further configured to: determine the first text string and the second text string using a lexicon; and exclude text from the first text string and the second text string if the text is not in the lexicon.
7. The device of claim 1 wherein the device comprises a camera or a mobile phone including a camera.
8. The device of claim 1 wherein the search engine is selected based on geographic proximity of the search engine to a location of the device at the time of communicating the search query.
9. A method comprising: capturing, by an image capture device, image data; forming, by the image capture device, a search query based on the image data; communicating, by the image capture device, the search query to a search engine; and receiving, by the image capture device, one or more documents from the search engine in response to the search query.
10. The method of claim 9 further comprising displaying, by the image capture device, the one or more documents.
11. The method of claim 9 wherein forming a search query further comprises: extracting, by the image capture device, content of the image data; and forming, by the image capture device, the search query based on the content.
12. The method of claim 11 further comprising: analyzing, by the image capture device, the content using a text filter to generate filtered content, the text filter being configured to analyze the content based on a lexicon; and generating, by the image capture device, the search query based on the filtered content, wherein the filtered content excludes any character, word, or mark that is not present in the lexicon.
13. The method of claim 9 further comprising: determining, by the image capture device, a first text string from the image data; determining, by the image capture device, a second text string from the image data; and conjoining, by the image capture device, the first text string and the second text string to generate a third text string, wherein the search query comprises the third text string.
14. The method of claim 13 wherein the first text string and the second text string are determined using a lexicon and wherein any text not present in the lexicon is not included in the first text string and the second text string.
15. The method of claim 9 wherein the search query comprises text.
16. The method of claim 9 wherein the image capture device is a mobile phone including a camera.
17. The method of claim 9 further comprising: communicating, by the image capture device, the search query to a first search engine and a second search engine; receiving, by the image capture device, a first set of documents from the first search engine and a second set of documents from the second search engine; and providing, by the image capture device, the first set of documents and the second set of documents to a user.
18. A system comprising: an image capture device; and a server; wherein the image capture device is configured to: capture image data; form a search query based on the image data; and communicate the search query to the server; wherein the server is configured to: identify one or more electronic documents based on the search query; and communicate the one or more electronic documents to the image capture device.
19. The system of claim 18 wherein the image capture device is further configured to: extract contents from the image data; apply a word filter to the extracted contents to filter out a character, a mark, or a word that is not present in an associated lexicon to generate filtered content; and form the search query based on the filtered content.
20. The system of claim 18 wherein the image capture device is further configured to: determine a first text string and a second text string from the image data; conjoin the first text string and the second text string to generate a third text string; and include the third text string in the search query.